Overview

Dataset statistics

Number of variables: 11
Number of observations: 156
Missing cells: 23
Missing cells (%): 1.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 22.2 KiB
Average record size in memory: 146.0 B
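The missing-cell percentage can be checked by hand from the figures in this report. The sketch below is only an arithmetic cross-check, not the profiler's own code; the variable names are illustrative, and the per-variable counts are the "Missing" values listed for each variable in this report.

```python
# Figures copied from this report; names are illustrative, not pandas-profiling API.
n_variables = 11
n_observations = 156
missing_per_variable = {
    "Positive affect": 1,
    "Negative affect": 1,
    "Social support": 1,
    "Freedom": 1,
    "Corruption": 8,
    "Generosity": 1,
    "Log of GDP per capita": 4,
    "Healthy life expectancy": 6,
}

total_cells = n_variables * n_observations          # every variable for every row
missing_cells = sum(missing_per_variable.values())  # per-variable counts added up
missing_pct = 100 * missing_cells / total_cells

print(total_cells)            # 1716
print(missing_cells)          # 23
print(round(missing_pct, 1))  # 1.3
```

The per-variable counts add up to the 23 missing cells reported above, which is about 1.3% of the 1716 cells.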

Variable types

NUM: 10
CAT: 1

Reproduction

Analysis started: 2020-05-18 10:54:53.618074
Analysis finished: 2020-05-18 10:55:06.851020
Duration: 13.23 seconds
Version: pandas-profiling v2.7.1
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
Country (region) has a high cardinality: 156 distinct values
Corruption has 8 (5.1%) missing values
Log of GDP per capita has 4 (2.6%) missing values
Healthy life expectancy has 6 (3.8%) missing values
Country (region) is uniformly distributed
Ladder is uniformly distributed
SD of Ladder is uniformly distributed
Positive affect is uniformly distributed
Negative affect is uniformly distributed
Social support is uniformly distributed
Freedom is uniformly distributed
Corruption is uniformly distributed
Generosity is uniformly distributed
Log of GDP per capita is uniformly distributed
Healthy life expectancy is uniformly distributed
Country (region) has unique values
Ladder has unique values
SD of Ladder has unique values

Variables

Country (region)
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE
Distinct count: 156
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB
Value | Count | Frequency (%)
Lebanon | 1 | 0.6%
Germany | 1 | 0.6%
Turkmenistan | 1 | 0.6%
Paraguay | 1 | 0.6%
Spain | 1 | 0.6%
France | 1 | 0.6%
Finland | 1 | 0.6%
China | 1 | 0.6%
Bosnia and Herzegovina | 1 | 0.6%
Uruguay | 1 | 0.6%
Other values (146) | 146 | 93.6%

Length

Max length: 24
Mean length: 8.211538462
Min length: 4

Value | Count | Frequency (%)
Lowercase_Letter | 26 | 50.0%
Uppercase_Letter | 23 | 44.2%
Close_Punctuation | 1 | 1.9%
Space_Separator | 1 | 1.9%
Open_Punctuation | 1 | 1.9%

Value | Count | Frequency (%)
Latin | 49 | 94.2%
Common | 3 | 5.8%

Value | Count | Frequency (%)
ASCII | 52 | 100.0%

Ladder
Real number (ℝ≥0)

UNIFORM
UNIQUE
Distinct count: 156
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.5
Minimum: 1
Maximum: 156
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8.75
Q1: 39.75
median: 78.5
Q3: 117.25
95-th percentile: 148.25
Maximum: 156
Range: 155
Interquartile range (IQR): 77.5

Descriptive statistics

Standard deviation: 45.17742799
Coefficient of variation (CV): 0.5755086368
Kurtosis: -1.2
Mean: 78.5
Median Absolute Deviation (MAD): 39
Skewness: 0
Sum: 12246
Variance: 2041
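These figures are what one would expect if Ladder holds each rank from 1 to 156 exactly once, which the minimum, maximum, distinct count and sum all suggest. The sketch below rebuilds the statistics under that assumption; the `quantile` helper is illustrative and uses linear interpolation between neighbouring values, which reproduces the quartiles shown above.

```python
import math
import statistics

n = 156
ranks = list(range(1, n + 1))  # assumption: Ladder is a permutation of 1..156


def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


total = sum(ranks)                     # closed form: n(n+1)/2
mean = statistics.mean(ranks)          # closed form: (n+1)/2
variance = statistics.variance(ranks)  # sample variance, closed form: n(n+1)/12
std = math.sqrt(variance)
cv = std / mean                        # coefficient of variation

print(total, mean, variance)  # 12246 78.5 2041
print(quantile(ranks, 0.25), quantile(ranks, 0.75))  # 39.75 117.25
```

The same closed forms with n = 155, 152, 150 or 148 account for the other rank-valued columns, whose n is reduced by their missing values.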
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
156 | 1 | 0.6%
49 | 1 | 0.6%
56 | 1 | 0.6%
55 | 1 | 0.6%
54 | 1 | 0.6%
53 | 1 | 0.6%
52 | 1 | 0.6%
51 | 1 | 0.6%
50 | 1 | 0.6%
48 | 1 | 0.6%
Other values (146) | 146 | 93.6%

Value | Count | Frequency (%)
1 | 1 | 0.6%
2 | 1 | 0.6%
3 | 1 | 0.6%
4 | 1 | 0.6%
5 | 1 | 0.6%

Value | Count | Frequency (%)
156 | 1 | 0.6%
155 | 1 | 0.6%
154 | 1 | 0.6%
153 | 1 | 0.6%
152 | 1 | 0.6%

SD of Ladder
Real number (ℝ≥0)

UNIFORM
UNIQUE
Distinct count: 156
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.5
Minimum: 1
Maximum: 156
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8.75
Q1: 39.75
median: 78.5
Q3: 117.25
95-th percentile: 148.25
Maximum: 156
Range: 155
Interquartile range (IQR): 77.5

Descriptive statistics

Standard deviation: 45.17742799
Coefficient of variation (CV): 0.5755086368
Kurtosis: -1.2
Mean: 78.5
Median Absolute Deviation (MAD): 39
Skewness: 0
Sum: 12246
Variance: 2041
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
156 | 1 | 0.6%
49 | 1 | 0.6%
56 | 1 | 0.6%
55 | 1 | 0.6%
54 | 1 | 0.6%
53 | 1 | 0.6%
52 | 1 | 0.6%
51 | 1 | 0.6%
50 | 1 | 0.6%
48 | 1 | 0.6%
Other values (146) | 146 | 93.6%

Value | Count | Frequency (%)
1 | 1 | 0.6%
2 | 1 | 0.6%
3 | 1 | 0.6%
4 | 1 | 0.6%
5 | 1 | 0.6%

Value | Count | Frequency (%)
156 | 1 | 0.6%
155 | 1 | 0.6%
154 | 1 | 0.6%
153 | 1 | 0.6%
152 | 1 | 0.6%

Positive affect
Real number (ℝ≥0)

UNIFORM
Distinct count: 155
Unique (%): 100.0%
Missing: 1
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.0
Minimum: 1.0
Maximum: 155.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8.7
Q1: 39.5
median: 78
Q3: 116.5
95-th percentile: 147.3
Maximum: 155
Range: 154
Interquartile range (IQR): 77

Descriptive statistics

Standard deviation: 44.88875137
Coefficient of variation (CV): 0.5754968125
Kurtosis: -1.2
Mean: 78
Median Absolute Deviation (MAD): 39
Skewness: 0
Sum: 12090
Variance: 2015
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
127 | 1 | 0.6%
60 | 1 | 0.6%
51 | 1 | 0.6%
50 | 1 | 0.6%
101 | 1 | 0.6%
119 | 1 | 0.6%
20 | 1 | 0.6%
89 | 1 | 0.6%
11 | 1 | 0.6%
80 | 1 | 0.6%
Other values (145) | 145 | 92.9%

Value | Count | Frequency (%)
1 | 1 | 0.6%
2 | 1 | 0.6%
3 | 1 | 0.6%
4 | 1 | 0.6%
5 | 1 | 0.6%

Value | Count | Frequency (%)
155 | 1 | 0.6%
154 | 1 | 0.6%
153 | 1 | 0.6%
152 | 1 | 0.6%
151 | 1 | 0.6%

Negative affect
Real number (ℝ≥0)

UNIFORM
Distinct count: 155
Unique (%): 100.0%
Missing: 1
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.0
Minimum: 1.0
Maximum: 155.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8.7
Q1: 39.5
median: 78
Q3: 116.5
95-th percentile: 147.3
Maximum: 155
Range: 154
Interquartile range (IQR): 77

Descriptive statistics

Standard deviation: 44.88875137
Coefficient of variation (CV): 0.5754968125
Kurtosis: -1.2
Mean: 78
Median Absolute Deviation (MAD): 39
Skewness: 0
Sum: 12090
Variance: 2015
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
152 | 1 | 0.6%
99 | 1 | 0.6%
51 | 1 | 0.6%
6 | 1 | 0.6%
45 | 1 | 0.6%
38 | 1 | 0.6%
35 | 1 | 0.6%
97 | 1 | 0.6%
113 | 1 | 0.6%
62 | 1 | 0.6%
Other values (145) | 145 | 92.9%

Value | Count | Frequency (%)
1 | 1 | 0.6%
2 | 1 | 0.6%
3 | 1 | 0.6%
4 | 1 | 0.6%
5 | 1 | 0.6%

Value | Count | Frequency (%)
155 | 1 | 0.6%
154 | 1 | 0.6%
153 | 1 | 0.6%
152 | 1 | 0.6%
151 | 1 | 0.6%

Social support
Real number (ℝ≥0)

UNIFORM
Distinct count: 155
Unique (%): 100.0%
Missing: 1
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.0
Minimum: 1.0
Maximum: 155.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8.7
Q1: 39.5
median: 78
Q3: 116.5
95-th percentile: 147.3
Maximum: 155
Range: 154
Interquartile range (IQR): 77

Descriptive statistics

Standard deviation: 44.88875137
Coefficient of variation (CV): 0.5754968125
Kurtosis: -1.2
Mean: 78
Median Absolute Deviation (MAD): 39
Skewness: 0
Sum: 12090
Variance: 2015
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
148 | 1 | 0.6%
90 | 1 | 0.6%
28 | 1 | 0.6%
12 | 1 | 0.6%
91 | 1 | 0.6%
34 | 1 | 0.6%
53 | 1 | 0.6%
69 | 1 | 0.6%
71 | 1 | 0.6%
86 | 1 | 0.6%
Other values (145) | 145 | 92.9%

Value | Count | Frequency (%)
1 | 1 | 0.6%
2 | 1 | 0.6%
3 | 1 | 0.6%
4 | 1 | 0.6%
5 | 1 | 0.6%

Value | Count | Frequency (%)
155 | 1 | 0.6%
154 | 1 | 0.6%
153 | 1 | 0.6%
152 | 1 | 0.6%
151 | 1 | 0.6%

Freedom
Real number (ℝ≥0)

UNIFORM
Distinct count: 155
Unique (%): 100.0%
Missing: 1
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.0
Minimum: 1.0
Maximum: 155.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8.7
Q1: 39.5
median: 78
Q3: 116.5
95-th percentile: 147.3
Maximum: 155
Range: 154
Interquartile range (IQR): 77

Descriptive statistics

Standard deviation: 44.88875137
Coefficient of variation (CV): 0.5754968125
Kurtosis: -1.2
Mean: 78
Median Absolute Deviation (MAD): 39
Skewness: 0
Sum: 12090
Variance: 2015
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
154 | 1 | 0.6%
81 | 1 | 0.6%
49 | 1 | 0.6%
45 | 1 | 0.6%
144 | 1 | 0.6%
126 | 1 | 0.6%
18 | 1 | 0.6%
47 | 1 | 0.6%
42 | 1 | 0.6%
57 | 1 | 0.6%
Other values (145) | 145 | 92.9%

Value | Count | Frequency (%)
1 | 1 | 0.6%
2 | 1 | 0.6%
3 | 1 | 0.6%
4 | 1 | 0.6%
5 | 1 | 0.6%

Value | Count | Frequency (%)
155 | 1 | 0.6%
154 | 1 | 0.6%
153 | 1 | 0.6%
152 | 1 | 0.6%
151 | 1 | 0.6%

Corruption
Real number (ℝ≥0)

MISSING
UNIFORM
Distinct count: 148
Unique (%): 100.0%
Missing: 8
Missing (%): 5.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 74.5
Minimum: 1.0
Maximum: 148.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8.35
Q1: 37.75
median: 74.5
Q3: 111.25
95-th percentile: 140.65
Maximum: 148
Range: 147
Interquartile range (IQR): 73.5

Descriptive statistics

Standard deviation: 42.86801449
Coefficient of variation (CV): 0.5754095905
Kurtosis: -1.2
Mean: 74.5
Median Absolute Deviation (MAD): 37
Skewness: 0
Sum: 11026
Variance: 1837.666667
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
61 | 1 | 0.6%
79 | 1 | 0.6%
96 | 1 | 0.6%
130 | 1 | 0.6%
30 | 1 | 0.6%
100 | 1 | 0.6%
92 | 1 | 0.6%
131 | 1 | 0.6%
68 | 1 | 0.6%
115 | 1 | 0.6%
Other values (138) | 138 | 88.5%
(Missing) | 8 | 5.1%

Value | Count | Frequency (%)
1 | 1 | 0.6%
2 | 1 | 0.6%
3 | 1 | 0.6%
4 | 1 | 0.6%
5 | 1 | 0.6%

Value | Count | Frequency (%)
148 | 1 | 0.6%
147 | 1 | 0.6%
146 | 1 | 0.6%
145 | 1 | 0.6%
144 | 1 | 0.6%

Generosity
Real number (ℝ≥0)

UNIFORM
Distinct count: 155
Unique (%): 100.0%
Missing: 1
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.0
Minimum: 1.0
Maximum: 155.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8.7
Q1: 39.5
median: 78
Q3: 116.5
95-th percentile: 147.3
Maximum: 155
Range: 154
Interquartile range (IQR): 77

Descriptive statistics

Standard deviation: 44.88875137
Coefficient of variation (CV): 0.5754968125
Kurtosis: -1.2
Mean: 78
Median Absolute Deviation (MAD): 39
Skewness: 0
Sum: 12090
Variance: 2015
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
85 | 1 | 0.6%
39 | 1 | 0.6%
119 | 1 | 0.6%
83 | 1 | 0.6%
40 | 1 | 0.6%
105 | 1 | 0.6%
10 | 1 | 0.6%
42 | 1 | 0.6%
95 | 1 | 0.6%
102 | 1 | 0.6%
Other values (145) | 145 | 92.9%

Value | Count | Frequency (%)
1 | 1 | 0.6%
2 | 1 | 0.6%
3 | 1 | 0.6%
4 | 1 | 0.6%
5 | 1 | 0.6%

Value | Count | Frequency (%)
155 | 1 | 0.6%
154 | 1 | 0.6%
153 | 1 | 0.6%
152 | 1 | 0.6%
151 | 1 | 0.6%

Log of GDP per capita
Real number (ℝ≥0)

MISSING
UNIFORM
Distinct count: 152
Unique (%): 100.0%
Missing: 4
Missing (%): 2.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 76.5
Minimum: 1.0
Maximum: 152.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8.55
Q1: 38.75
median: 76.5
Q3: 114.25
95-th percentile: 144.45
Maximum: 152
Range: 151
Interquartile range (IQR): 75.5

Descriptive statistics

Standard deviation: 44.02272141
Coefficient of variation (CV): 0.5754604105
Kurtosis: -1.2
Mean: 76.5
Median Absolute Deviation (MAD): 38
Skewness: 0
Sum: 11628
Variance: 1938
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
140 | 1 | 0.6%
33 | 1 | 0.6%
37 | 1 | 0.6%
27 | 1 | 0.6%
43 | 1 | 0.6%
62 | 1 | 0.6%
5 | 1 | 0.6%
86 | 1 | 0.6%
48 | 1 | 0.6%
38 | 1 | 0.6%
Other values (142) | 142 | 91.0%
(Missing) | 4 | 2.6%

Value | Count | Frequency (%)
1 | 1 | 0.6%
2 | 1 | 0.6%
3 | 1 | 0.6%
4 | 1 | 0.6%
5 | 1 | 0.6%

Value | Count | Frequency (%)
152 | 1 | 0.6%
151 | 1 | 0.6%
150 | 1 | 0.6%
149 | 1 | 0.6%
148 | 1 | 0.6%

Healthy life expectancy
Real number (ℝ≥0)

MISSING
UNIFORM
Distinct count: 150
Unique (%): 100.0%
Missing: 6
Missing (%): 3.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 75.5
Minimum: 1.0
Maximum: 150.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8.45
Q1: 38.25
median: 75.5
Q3: 112.75
95-th percentile: 142.55
Maximum: 150
Range: 149
Interquartile range (IQR): 74.5

Descriptive statistics

Standard deviation: 43.44536799
Coefficient of variation (CV): 0.5754353376
Kurtosis: -1.2
Mean: 75.5
Median Absolute Deviation (MAD): 37.5
Skewness: 0
Sum: 11325
Variance: 1887.5
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
143 | 1 | 0.6%
6 | 1 | 0.6%
41 | 1 | 0.6%
9 | 1 | 0.6%
68 | 1 | 0.6%
58 | 1 | 0.6%
70 | 1 | 0.6%
45 | 1 | 0.6%
61 | 1 | 0.6%
38 | 1 | 0.6%
Other values (140) | 140 | 89.7%
(Missing) | 6 | 3.8%

Value | Count | Frequency (%)
1 | 1 | 0.6%
2 | 1 | 0.6%
3 | 1 | 0.6%
4 | 1 | 0.6%
5 | 1 | 0.6%

Value | Count | Frequency (%)
150 | 1 | 0.6%
149 | 1 | 0.6%
148 | 1 | 0.6%
147 | 1 | 0.6%
146 | 1 | 0.6%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a perfectly linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
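As a minimal sketch of that calculation (not the code pandas-profiling itself runs), the function below divides the covariance by the product of the standard deviations. The function name and the toy data are illustrative; the common 1/(n-1) factors cancel, so plain sums of deviations are used.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/(n-1) normalisation appears in numerator and denominator and cancels.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when either variable is constant")
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4, 5]
steep = [10 * x + 3 for x in xs]     # shifted and rescaled copy of xs
falling = [-0.5 * x + 7 for x in xs]  # same line family, negative slope

print(pearson_r(xs, steep))    # approximately 1.0
print(pearson_r(xs, falling))  # approximately -1.0
```

Shifting or rescaling either variable leaves the magnitude of r unchanged; only the sign of the slope shows up in the result.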

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
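The rank-then-correlate recipe can be sketched as follows. This is an illustrative implementation, not the one used by the report: the helper names are hypothetical, and tied values are given their average rank, which is one common convention.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
cubes = [x ** 3 for x in xs]  # monotonic but clearly nonlinear

print(spearman_rho(xs, cubes))  # approximately 1.0
print(pearson_r(xs, cubes))     # noticeably below 1.0
```

Because only the ordering matters, the cubic relationship scores a perfect ρ while Pearson's r is penalised for the curvature.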

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures the ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
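The pair-counting definition translates directly into code. The sketch below implements exactly the formula stated here (often called tau-a); library implementations frequently use a tie-corrected variant, which gives the same result only when there are no ties. The function name and sample data are illustrative.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither total
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


xs = [1, 2, 3, 4, 5]
ys = [1, 3, 2, 5, 4]  # two adjacent swaps relative to xs

print(kendall_tau(xs, ys))  # 0.6
```

Of the 10 pairs in this example, 8 are concordant and 2 are discordant, giving (8 - 2) / 10 = 0.6.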

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Missing values

Sample

First rows

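  | Country (region) | Ladder | SD of Ladder | Positive affect | Negative affect | Social support | Freedom | Corruption | Generosity | Log of GDP per capita | Healthy life expectancy
0 | Finland | 1 | 4 | 41.0 | 10.0 | 2.0 | 5.0 | 4.0 | 47.0 | 22.0 | 27.0
1 | Denmark | 2 | 13 | 24.0 | 26.0 | 4.0 | 6.0 | 3.0 | 22.0 | 14.0 | 23.0
2 | Norway | 3 | 8 | 16.0 | 29.0 | 3.0 | 3.0 | 8.0 | 11.0 | 7.0 | 12.0
3 | Iceland | 4 | 9 | 3.0 | 3.0 | 1.0 | 7.0 | 45.0 | 3.0 | 15.0 | 13.0
4 | Netherlands | 5 | 1 | 12.0 | 25.0 | 15.0 | 19.0 | 12.0 | 7.0 | 12.0 | 18.0
5 | Switzerland | 6 | 11 | 44.0 | 21.0 | 13.0 | 11.0 | 7.0 | 16.0 | 8.0 | 4.0
6 | Sweden | 7 | 18 | 34.0 | 8.0 | 25.0 | 10.0 | 6.0 | 17.0 | 13.0 | 17.0
7 | New Zealand | 8 | 15 | 22.0 | 12.0 | 5.0 | 8.0 | 5.0 | 8.0 | 26.0 | 14.0
8 | Canada | 9 | 23 | 18.0 | 49.0 | 20.0 | 9.0 | 11.0 | 14.0 | 19.0 | 8.0
9 | Austria | 10 | 10 | 64.0 | 24.0 | 31.0 | 26.0 | 19.0 | 25.0 | 16.0 | 15.0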

Last rows

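  | Country (region) | Ladder | SD of Ladder | Positive affect | Negative affect | Social support | Freedom | Corruption | Generosity | Log of GDP per capita | Healthy life expectancy
146 | Haiti | 147 | 111 | 142.0 | 119.0 | 146.0 | 152.0 | 48.0 | 20.0 | 138.0 | 125.0
147 | Botswana | 148 | 125 | 87.0 | 65.0 | 105.0 | 60.0 | 54.0 | 150.0 | 66.0 | 113.0
148 | Syria | 149 | 137 | 155.0 | 155.0 | 154.0 | 153.0 | 38.0 | 69.0 | NaN | 128.0
149 | Malawi | 150 | 132 | 129.0 | 110.0 | 150.0 | 65.0 | 64.0 | 109.0 | 147.0 | 119.0
150 | Yemen | 151 | 85 | 153.0 | 75.0 | 100.0 | 147.0 | 83.0 | 155.0 | 141.0 | 124.0
151 | Rwanda | 152 | 63 | 54.0 | 102.0 | 144.0 | 21.0 | 2.0 | 90.0 | 132.0 | 103.0
152 | Tanzania | 153 | 122 | 78.0 | 50.0 | 131.0 | 78.0 | 34.0 | 49.0 | 125.0 | 118.0
153 | Afghanistan | 154 | 25 | 152.0 | 133.0 | 151.0 | 155.0 | 136.0 | 137.0 | 134.0 | 139.0
154 | Central African Republic | 155 | 117 | 132.0 | 153.0 | 155.0 | 133.0 | 122.0 | 113.0 | 152.0 | 150.0
155 | South Sudan | 156 | 140 | 127.0 | 152.0 | 148.0 | 154.0 | 61.0 | 85.0 | 140.0 | 143.0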